Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable IOMMUFD for VM device passthrough #4004

Merged
merged 1 commit into from
Nov 5, 2024

Conversation

fanchen2
Copy link
Contributor

iommufd-backed VFIO, for example:
-object iommufd,id=iommufd0
-device vfio-pci,host=01:00.0,iommufd=iommufd0
Multiple pcie device can use the same iommufd.


def iommufd_object_define_by_params(self, obj_id):
"""
Create iommufd object device by params
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before creating an iommufd device, it is better to check that /dev/iommu exists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your quick response, will do.

@@ -4031,9 +4031,25 @@ def hostdev_define_by_params(self, name, params, bus=None):
"failover_pair_id": params.get("vm_hostdev_failover_pair_id"),
}
)
# Use hard code 'iommufd0' here, multiple devices use the same iommufd
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you mentioned here, multiple devices can use the same one. How about checking the iommufd as an ID? If the iommufd device exists with the same ID, reuse it; if not, create a new one, cause we should also support using different iommufd devices, what do you think?

@@ -4031,9 +4031,25 @@ def hostdev_define_by_params(self, name, params, bus=None):
"failover_pair_id": params.get("vm_hostdev_failover_pair_id"),
}
)
# Use hard code 'iommufd0' here, multiple devices use the same iommufd
if params.get("iommufd") == "yes":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this code block, all parameters use vm_hostdev prefix, I would suggest align with this.

@@ -4031,9 +4031,25 @@ def hostdev_define_by_params(self, name, params, bus=None):
"failover_pair_id": params.get("vm_hostdev_failover_pair_id"),
}
)
# Use hard code 'iommufd0' here, multiple devices use the same iommufd
if params.get("iommufd") == "yes":
dev_params["iommufd"] = "iommufd0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the iommufd is only used for VFIO devices, so I have an idea here,

devs = []
...
...
fd_id = params.get("vm_hostdev_iommufd")
if fd_id and not self.get_by_qid(fd_id):
    iommufd = iommufd_object_define_by_params()
    dev_params["iommufd"] = fd_id
    devs.append(iommufd)
...
...
dev = qdevices.QDevice(driver, dev_params, parent_bus=dev_bus)
devs.append(dev)
return devs

If you have any concern, please don't hesitate to raise them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I think I understand what you mean, will try to follow this solution to update the PR, thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PaulYuuu Updated, and updated PR autotest/tp-qemu#4163 to match the new design. Could you review again?
Some explanation:
To support multiple devices with multiple iommufd, we need to map iommufd and hostdev, params.get("vm_hostdev_iommufd") will get multiple devices, so we cannot use it directly, we have to map them in a loop. But we don't know the order of current hostdev in hostdev_define_by_params. And we still need to keep the case: multiple hostdev with the same iommufd. So we need "if else" and "break" in this loop to map them carefully.

  1. For multiple hostdev with single iommufd case, we need to attach iommufd to the 2nd, 3rd,... hostdevs, so we need the "else", because the iommufd object has been created once for the 1st hostdev
  2. For multiple hostdev with multiple iommufd case, when it comes to the next iommufd for the same hostdev in loop, we should not rewrite the iommufd id, so we need "break". "else" is redundant here, but it will be overwritten when it comes to next hostdev, which we hope next iommufd can be used, so it doesn't matter, we have to keep it for case 1).

Thank you so much.

# To support single/multiple iommufd for single/multiple hostdev
for fd_id in iommufd_id:
if not self.get_by_qid(fd_id):
iommufd = self.iommufd_object_define_by_params(fd_id)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iommufd maybe create fail, right? for example IOMMUFD driver not loaded and no /dev/iommu.

It needs a check for object before adding iommufd for device.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xhao22 Thanks for reminder, updated in iommufd_object_define_by_param, raise error if /dev/iommu doesn't exist.


devs = []

iommufd_id = params.objects("vm_hostdev_iommufd")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is for all vfio devices, but each device has its namespace. like I can set vm_hostdev_iommufd_hostdev1 = iommufd0 and vm_hostdev_iommufd_hostdev2 = iommufd1. As we know, every vfio device can only use 1 iommufd object, so let this parameter be used for device namespace can avoid using break.

iommufd_id = params.get("vm_hostdev_iommufd")
if iommufd_id:
    dev_params["iommufd"] = iommufd_id
    if not self.get_by_qid(iommufd_id):
        iommufd = self.iommufd_object_define_by_params(iommufd_id)
        devs.append(iommufd)

But if you have other requirements like negative test, please let me know.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PaulYuuu
This break is just for vm_hostdev_iommufd_hostdev1 = iommufd0 and vm_hostdev_iommufd_hostdev2 = iommufd1.
Because this function hostdev_define_by_params will be called by each device, for hostdev2, it will be called again.
But we use "for" in this function to check all iommufd id we set in vfio_net_boot.cfg in tp_qemu:

        # multi hostdev will be bound to different iommufd
        - multi_iommufd:
            vm_hostdev_iommufd = iommufd0 iommufd1

So without break, for hostdev1, it will be set to vm_hostdev_iommufd_hostdev1 = iommufd0 in first iteration, and then be overwritten to vm_hostdev_iommufd_hostdev1 = iommufd1 in the next interation.
Then the para will be like:
vm_hostdev_iommufd_hostdev1 = iommufd1(iommufd0 will be overwritten in this loop)
vm_hostdev_iommufd_hostdev2 = iommufd1

And we cannot map iommufd outside hostdev_define_by_params function
So, I just think we can use "break" here, I have tested it, the para can be set as we expected.

Copy link
Contributor

@PaulYuuu PaulYuuu Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I want to test 2 vfio devices, 1 with iommufd, 1 without, for the current design, this is impossible, causes the only one iommufd object will attend to all vfio devices.

So without break, for hostdev1, it will be set to vm_hostdev_iommufd_hostdev1 = iommufd0 in first iteration, and then be overwritten to vm_hostdev_iommufd_hostdev1 = iommufd1 in the next interation.
Then the para will be like:
vm_hostdev_iommufd_hostdev1 = iommufd1(iommufd0 will be overwritten in this loop)
vm_hostdev_iommufd_hostdev2 = iommufd1

As I mentioned, vfio devices have their namespace during the device creation. in qemu_vm.py

        for hostdev in params.objects("vm_hostdevs"):
            hostdev_params = params.object_params(hostdev)
            dev = devices.hostdev_define_by_params(hostdev, hostdev_params)

object_params will cut out vm_hostdev_iommufd_hostdev1 = iommufd0 to vm_hostdev_iommufd = iommufd0 before hostdev_define_by_params. When creating hostdev2, vm_hostdev_iommufd_hostdev2 = iommufd1 will be vm_hostdev_iommufd = iommufd1, so hostdev_define_by_params can always handle the correct iommufd definition. And if you want to test all vfio with same iommufd, just set vm_hostdev_iommufd = iommufd0.

# with same iommufd
vm_hostdev_iommufd = iommufd0

# with diff iommufd
vm_hostdev_iommufd_hostdev1 = iommufd0
vm_hostdev_iommufd_hostdev2 = iommufd1

You can read the context from qemu_vm and also check object_params func.

Copy link
Contributor Author

@fanchen2 fanchen2 Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @PaulYuuu ,to double confirm, do you mean we can fix the iommufd id for each device in vfio_net_boot.cfg like this?

# with diff iommufd
vm_hostdev_iommufd_hostdev1 = iommufd0
vm_hostdev_iommufd_hostdev2 = iommufd1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's right!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PaulYuuu Thanks for your advice, updated this PR, and updated autotest/tp-qemu#4163 as well, could you help to check again?

:return: the iommufd QObject device
"""
if not os.path.exists("/dev/iommu"):
raise ValueError("iommufd is not supported, modprobe iommufd to enable it")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @fanchen2, can you check creating an iommufd VM without modprobe iommufd? IIUC, qemu will automatically load it when you try to attend a iommufd to vfio. If this works, we can bypass this.

Sorry for my previous comment confusion here. But yes, modprobe in advance is good.

Copy link
Contributor Author

@fanchen2 fanchen2 Oct 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@PaulYuuu
iommufd cannot be removed directly, it has dependency with vfio.
To rmmod iommufd, we need to rmmod vfio_pci vfio_pci_core vfio_iommu_type1 vfio first.
And when modprobe vfio(which is already be checked and loaded by avocado-vt), iommufd will be loaded together. So, we cannot try with vfio loaded but without iommufd loaded.
If modprobe iommufd without vfio, iommufd can be loaded, and /dev/iommu can be found, but vfio will not be loaded together.
So in this case, looks like we can skip this check? What do you think? If so, let me remove it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, thank you.

Signed-off-by: Farrah Chen <[email protected]>
Signed-off-by: Xudong Hao <[email protected]>
@PaulYuuu
Copy link
Contributor

Hello @luckyh @YongxueHong, any concern about this PR?

@YongxueHong
Copy link
Contributor

CC @zhencliu
Could you help to review it?
Thanks.

@luckyh
Copy link
Contributor

luckyh commented Nov 5, 2024

Thanks to all.

@luckyh luckyh merged commit 0595c93 into avocado-framework:master Nov 5, 2024
50 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants